343 research outputs found

    CANF-VC++: Enhancing Conditional Augmented Normalizing Flows for Video Compression with Advanced Techniques

    Full text link
    Video has become the predominant medium for information dissemination, driving the need for efficient video codecs. Recent advancements in learned video compression have shown promising results, surpassing traditional codecs in terms of coding efficiency. However, challenges remain in integrating fragmented techniques and incorporating new tools into existing codecs. In this paper, we comprehensively review the state-of-the-art CANF-VC codec and propose CANF-VC++, an enhanced version that addresses these challenges. We systematically explore architecture design, reference frame type, training procedure, and entropy coding efficiency, leading to substantial coding improvements. CANF-VC++ achieves significant Bj{\o}ntegaard-Delta rate savings on conventional datasets UVG, HEVC Class B and MCL-JCV, outperforming the baseline CANF-VC and even the H.266 reference software VTM. Our work demonstrates the potential of integrating advancements in video compression and serves as inspiration for future research in the field

    Transformer-based Image Compression with Variable Image Quality Objectives

    Full text link
    This paper presents a Transformer-based image compression system that allows for a variable image quality objective according to the user's preference. Optimizing a learned codec for different quality objectives leads to reconstructed images with varying visual characteristics. Our method provides the user with the flexibility to choose a trade-off between two image quality objectives using a single, shared model. Motivated by the success of prompt-tuning techniques, we introduce prompt tokens to condition our Transformer-based autoencoder. These prompt tokens are generated adaptively based on the user's preference and input image through learning a prompt generation network. Extensive experiments on commonly used quality metrics demonstrate the effectiveness of our method in adapting the encoding and/or decoding processes to a variable quality objective. While offering the additional flexibility, our proposed method performs comparably to the single-objective methods in terms of rate-distortion performance

    Transformer-based Variable-rate Image Compression with Region-of-interest Control

    Full text link
    This paper proposes a transformer-based learned image compression system. It is capable of achieving variable-rate compression with a single model while supporting the region-of-interest (ROI) functionality. Inspired by prompt tuning, we introduce prompt generation networks to condition the transformer-based autoencoder of compression. Our prompt generation networks generate content-adaptive tokens according to the input image, an ROI mask, and a rate parameter. The separation of the ROI mask and the rate parameter allows an intuitive way to achieve variable-rate and ROI coding simultaneously. Extensive experiments validate the effectiveness of our proposed method and confirm its superiority over the other competing methods.Comment: Accepted to IEEE ICIP 202
    • …
    corecore